Abstract

We prove that NP $\ne$ coNP and coNP $\not\subseteq$ MA in the number-on-forehead
model of multiparty communication complexity for up to
$k = (1-\epsilon)\log n$ players, where $\epsilon > 0$ is any constant. Specifically, we
construct a function $F : (\{0,1\}^n)^k \to \{0,1\}$ with co-nondeterministic
complexity $O(\log n)$ and Merlin-Arthur complexity $n^{\Omega(1)}$. The problem
was open for $k \ge 3$.

1 Introduction 

The number-on-forehead model of multiparty communication complexity
[CFL] features $k$ communicating players whose goal is to compute a
given distributed function. More precisely, one considers a Boolean function
$F : (\{0,1\}^n)^k \to \{-1,+1\}$ whose arguments $x_1, \dots, x_k \in \{0,1\}^n$ are placed
on the foreheads of players 1 through $k$, respectively. Thus, player $i$ sees all
the arguments except for $x_i$. The players communicate by writing bits on
a shared blackboard, visible to all. Their goal is to compute $F(x_1, \dots, x_k)$
with minimum communication. The multiparty model has found a variety
of applications, including circuit complexity, pseudorandomness, and proof
complexity [Y, HG, BNS, RW, BPS]. This model draws its richness from
the overlap in the players' inputs, which makes it challenging to prove lower
bounds. Several fundamental questions in the multiparty model remain open
despite much research.
1.1 Previous Work and Our Results

The $k$-party number-on-forehead model naturally gives rise to the complexity
classes $\mathrm{NP}_k^{cc}$, $\mathrm{coNP}_k^{cc}$, $\mathrm{BPP}_k^{cc}$, and $\mathrm{MA}_k^{cc}$, corresponding to communication
problems $F : (\{0,1\}^n)^k \to \{-1,+1\}$ with efficient nondeterministic,
co-nondeterministic, randomized, and Merlin-Arthur protocols, respectively.
An efficient protocol is one with communication cost $\log^{O(1)} n$. Determining
the exact relationships among these classes is a natural goal in complexity
theory.

For example, it had been open to show that nondeterministic protocols
can be more powerful than randomized ones for $k \ge 3$ players. This problem
was recently solved in [LS, CA] for up to $k = (1 - o(1)) \log_2 \log_2 n$ players,
and later strengthened in [DP] to $k = (1-\epsilon) \log_2 n$ players, where $\epsilon > 0$ is
any given constant. An explicit separation for the latter case was obtained
in [DPV].

The contribution in this paper is to relate the power of nondeterministic,
co-nondeterministic, and Merlin-Arthur protocols. For $k = 2$ players, the
relations among these models are well understood [KN, K2]: it is known
that $\mathrm{coNP}_2^{cc} \ne \mathrm{NP}_2^{cc}$ and further that $\mathrm{coNP}_2^{cc} \not\subseteq \mathrm{MA}_2^{cc}$. Starting at $k = 3$,
however, it has been open to even separate $\mathrm{NP}_k^{cc}$ and $\mathrm{coNP}_k^{cc}$. Our main
result is that $\mathrm{coNP}_k^{cc} \not\subseteq \mathrm{MA}_k^{cc}$ for up to $k = (1-\epsilon) \log_2 n$ players, where $\epsilon > 0$
is an arbitrary constant. The separation is by an explicitly given function.
In particular, our work shows that $\mathrm{NP}_k^{cc} \ne \mathrm{coNP}_k^{cc}$ and also subsumes the
separation in [DP, DPV], since $\mathrm{NP}_k^{cc} \subseteq \mathrm{MA}_k^{cc}$ and $\mathrm{BPP}_k^{cc} \subseteq \mathrm{MA}_k^{cc}$. Let
the symbols $N(F)$, $N(-F)$, and $MA(F)$ denote the nondeterministic,
co-nondeterministic, and Merlin-Arthur complexity of $F$ in the $k$-party
number-on-forehead model.

Theorem 1.1 (Main Result). Let $k \le (1-\epsilon)\log_2 n$, where $\epsilon > 0$ is any
given constant. Then there is an (explicitly given) function
$F : (\{0,1\}^n)^k \to \{-1,+1\}$ with

\[ N(-F) = O(\log n) \]

and

\[ MA(F) = n^{\Omega(1)}. \]

In particular, $\mathrm{coNP}_k^{cc} \not\subseteq \mathrm{MA}_k^{cc}$ and $\mathrm{NP}_k^{cc} \ne \mathrm{coNP}_k^{cc}$.

It is a longstanding open problem to exhibit a function with nontrivial
multiparty complexity for $k \ge \log_2 n$ players. Therefore, the separation in
Theorem 1.1 is state-of-the-art with respect to the number of players.

The proof of Theorem 1.1, to be described shortly, is based on the pattern
matrix method [S1, S2] and its multiparty generalization in [DPV]. In
the final section of this paper, we revisit several other multiparty
generalizations [C, LS, CA, BH] of the pattern matrix method. By applying our
techniques in these other settings, we are able to obtain similar exponential
separations by functions as simple as constant-depth circuits. However,
these new separations only hold up to $k = \epsilon \log n$ players, unlike the
separation in Theorem 1.1.

1.2 Previous Techniques 

Perhaps the best-known method for communication lower bounds, both in
the number-on-forehead multiparty model and various two-party models, is
the discrepancy method [KN]. The method consists in exhibiting a distribution
$P$ with respect to which the function $F$ of interest has negligible
discrepancy, i.e., negligible correlation with all low-cost protocols. A more
powerful technique is the generalized discrepancy method [K1, R3]. This
method consists in exhibiting a distribution $P$ and a function $H$ such that,
on the one hand, the function $F$ of interest is well-correlated with $H$ with
respect to $P$, but on the other hand, $H$ has negligible discrepancy with respect
to $P$.

In practice, considerable effort is required to find suitable $P$ and $H$ and
to analyze the resulting discrepancies. In particular, no strong bounds were
available on the discrepancy or generalized discrepancy of constant-depth
circuits $\mathrm{AC}^0$. The recent pattern matrix method [S1, S2] solves this problem
for $\mathrm{AC}^0$ and a large family of other matrices. More specifically, the method
uses standard analytic properties of Boolean functions (such as approximate
degree or threshold degree) to determine the discrepancy and generalized
discrepancy of the associated communication problems.

Originally formulated in [S1, S2] for the two-party model, the pattern
matrix method has been adapted to the multiparty model by several
authors [LS, CA, DP, DPV, BH]. The first adaptation of the method to
the multiparty model gave improved lower bounds for the multiparty
disjointness function [LS, CA]. This line of work was combined in [DP, DPV]
with probabilistic arguments to separate the classes $\mathrm{NP}_k^{cc}$ and $\mathrm{BPP}_k^{cc}$ for up
to $k = (1-\epsilon) \log_2 n$ players, by an explicit function. A new paper [BH] gives
polynomial lower bounds for constant-depth circuits, in the model with up
to $k = \epsilon \log n$ players. Further details on this body of research and other
duality-based approaches [SZ] can be found in the survey article [S3].

1.3 Our Approach 

To obtain our main result, we combine the work in [DP, DPV] with several
new ideas. First, we derive a new criterion for high nondeterministic
communication complexity, inspired by the Klauck-Razborov generalized
discrepancy method [K1, R3]. Similar to Klauck-Razborov, we also look
for a hard function $H$ that is well-correlated with the function $F$ of
interest, but we additionally quantify the agreement of $H$ and $F$ on the set
$F^{-1}(-1)$. This agreement ensures that $F^{-1}(-1)$ does not have a small cover
by cylinder intersections, thus placing $F$ outside $\mathrm{NP}_k^{cc}$. To handle the more
powerful Merlin-Arthur model, we combine this development with an earlier
technique [K2] for proving lower bounds against two-party Merlin-Arthur
protocols.

In keeping with the philosophy of the pattern matrix method, we then
reformulate the agreement requirement for $H$ and $F$ as a suitable analytic
property of the underlying Boolean function $f$ and prove this property
directly, using linear programming duality. The function $f$ in question happens
to be OR.

Finally, we apply our program to the specific function $F$ constructed
in [DPV] for the purpose of separating $\mathrm{NP}_k^{cc}$ and $\mathrm{BPP}_k^{cc}$. Since $F$ has small
nondeterministic complexity by design, the proof of our main result is
complete once we apply our machinery to $-F$ and derive a lower bound on
$MA(-F)$.

1.4 Organization 

We start in Section 2 with relevant technical preliminaries and standard
background on multiparty communication complexity. In Section 3, we review
the original discrepancy method, the generalized discrepancy method,
and the pattern matrix method. In Section 4, we derive the new criterion for
high nondeterministic and Merlin-Arthur communication complexity. The
proof of Theorem 1.1 comes next, in Section 5. In the final section of the
paper, we explore some implications of this work in light of other multiparty
papers [C, LS, CA, BH].
2 Preliminaries

We view Boolean functions as mappings $X \to \{-1,+1\}$, where $X$ is a finite
set such as $X = \{0,1\}^n$ or $X = \{0,1\}^n \times \{0,1\}^n$. We identify $-1$ and $+1$
with "true" and "false," respectively. The notation $[n]$ stands for the set
$\{1, 2, \dots, n\}$. For integers $N, n$ with $N \ge n$, the symbol $\binom{[N]}{n}$ denotes the
family of all size-$n$ subsets of $\{1, 2, \dots, N\}$. For a string $x \in \{-1,+1\}^N$
and a set $S \in \binom{[N]}{n}$, we define $x|_S = (x_{i_1}, x_{i_2}, \dots, x_{i_n}) \in \{-1,+1\}^n$, where
$i_1 < i_2 < \cdots < i_n$ are the elements of $S$. For $x \in \{0,1\}^n$, we write
$|x| = x_1 + \cdots + x_n$. Throughout this manuscript, "log" refers to the logarithm to
base 2. For a function $f : X \to \mathbb{R}$, where $X$ is an arbitrary finite set, we
write $\|f\|_\infty = \max_{x \in X} |f(x)|$.

We will need the following observation regarding discrete probability
distributions on the hypercube, cf. [S1].

Proposition 2.1. Let $\mu(x)$ be a probability distribution on $\{0,1\}^n$. Fix
$i_1, \dots, i_n \in \{1, 2, \dots, n\}$. Then

\[ \sum_{x \in \{0,1\}^n} \mu(x_{i_1}, \dots, x_{i_n}) \le 2^{\,n - |\{i_1, \dots, i_n\}|}. \]

For functions $f, g : X_1 \times \cdots \times X_k \to \mathbb{R}$ (where $X_i$ is a finite set,
$i = 1, 2, \dots, k$), we define $\langle f, g \rangle = \sum_{(x_1,\dots,x_k)} f(x_1,\dots,x_k)\, g(x_1,\dots,x_k)$. When $f$
and $g$ are vectors or matrices, this is the standard definition of inner product.
The Hadamard product of $f$ and $g$ is the tensor $f \circ g : X_1 \times \cdots \times X_k \to \mathbb{R}$
given by $(f \circ g)(x_1,\dots,x_k) = f(x_1,\dots,x_k)\, g(x_1,\dots,x_k)$.

The symbol $\mathbb{R}^{m \times n}$ refers to the family of all $m \times n$ matrices with real
entries. The $(i,j)$th entry of a matrix $A$ is denoted by $A_{ij}$. In most matrices
that arise in this work, the exact ordering of the columns (and rows) is
irrelevant. In such cases, we describe a matrix using the notation $[F(i,j)]_{i \in I,\, j \in J}$,
where $I$ and $J$ are some index sets.

We conclude with a review of the Fourier transform over $\mathbb{Z}_2^n$. Consider
the vector space of functions $\{0,1\}^n \to \mathbb{R}$, equipped with the inner product
$\langle f, g \rangle = 2^{-n} \sum_{x \in \{0,1\}^n} f(x) g(x)$. For $S \subseteq [n]$, define $\chi_S : \{0,1\}^n \to \{-1,+1\}$ by
$\chi_S(x) = (-1)^{\sum_{i \in S} x_i}$. Then $\{\chi_S\}_{S \subseteq [n]}$ is an orthonormal basis for the inner
product space in question. As a result, every function $f : \{0,1\}^n \to \mathbb{R}$
has a unique representation of the form $f = \sum_{S \subseteq [n]} \hat{f}(S)\, \chi_S$, where
$\hat{f}(S) = \langle f, \chi_S \rangle$. The reals $\hat{f}(S)$ are called the Fourier coefficients of $f$. The following
fact is immediate from the definition of $\hat{f}(S)$:

Proposition 2.2. Fix $f : \{0,1\}^n \to \mathbb{R}$. Then

\[ \max_{S \subseteq [n]} |\hat{f}(S)| \le 2^{-n} \sum_{x \in \{0,1\}^n} |f(x)|. \]
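
The Fourier definitions above are easy to check by brute force on small inputs. The following Python sketch is only an illustration: the helper names `fourier_coefficient` and `all_coefficients` are ours rather than the paper's, coordinates are 0-indexed, and the running time is exponential in $n$. It computes the coefficients of the function that equals $+1$ at the origin and $-1$ elsewhere, and compares the largest coefficient against the bound of Proposition 2.2.

```python
from itertools import product, combinations

def fourier_coefficient(f, n, S):
    """Return hat f(S) = 2^{-n} * sum_x f(x) * chi_S(x),
    where chi_S(x) = (-1)^{sum of x_i over i in S}."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        chi = -1 if sum(x[i] for i in S) % 2 else 1
        total += f(x) * chi
    return total / 2 ** n

def all_coefficients(f, n):
    """Map every subset S of range(n), given as a sorted tuple, to hat f(S)."""
    return {S: fourier_coefficient(f, n, S)
            for r in range(n + 1)
            for S in combinations(range(n), r)}

# Example function (an illustrative choice): +1 at the origin, -1 elsewhere.
n = 3
f = lambda z: 1 if not any(z) else -1
coeffs = all_coefficients(f, n)

# Both sides of Proposition 2.2.
l1_average = sum(abs(f(x)) for x in product((0, 1), repeat=n)) / 2 ** n
max_coeff = max(abs(c) for c in coeffs.values())
```

For this particular function, the empty-set coefficient is $-3/4$ and every other coefficient is $1/4$, so the bound of Proposition 2.2 holds with room to spare.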



2.1 Communication Complexity 

An excellent reference on communication complexity is the monograph by
Kushilevitz and Nisan [KN]. In this overview, we will limit ourselves to
key definitions and notation. The simplest model of communication in this
work is the two-party randomized model. Consider a function
$F : X \times Y \to \{-1,+1\}$, where $X$ and $Y$ are finite sets. Alice receives an input $x \in X$,
Bob receives $y \in Y$, and their objective is to predict $F(x,y)$ with high
accuracy. To this end, Alice and Bob share a communication channel and
have an unlimited supply of shared random bits. Alice and Bob's protocol
is said to have error $\epsilon$ if on every input $(x,y)$, the computed output differs
from the correct answer $F(x,y)$ with probability no greater than $\epsilon$. The
cost of a given protocol is the maximum number of bits exchanged on any
input. The randomized communication complexity of $F$, denoted $R_\epsilon(F)$, is
the least cost of an $\epsilon$-error protocol for $F$. It is standard practice to use the
shorthand $R(F) = R_{1/3}(F)$. Recall that the error probability of a protocol
can be decreased from 1/3 to any other positive constant at the expense of
increasing the communication cost by a constant factor. We will use this
fact in our proofs without further mention.

A generalization of two-party communication is the multiparty
number-on-forehead model of communication. Here one considers a function
$F : X_1 \times \cdots \times X_k \to \{-1,+1\}$ for some finite sets $X_1, \dots, X_k$. There are $k$
players. A given input $(x_1, \dots, x_k) \in X_1 \times \cdots \times X_k$ is distributed among
the players by placing $x_i$ on the forehead of player $i$ (for $i = 1, \dots, k$).
In other words, player $i$ knows $x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_k$ but not $x_i$. The
players communicate by writing bits on a shared blackboard, visible to all.
They additionally have access to a shared source of random bits. Their goal
is to devise a communication protocol that will allow them to accurately
predict the value of $F$ on every input. Analogous to the two-party case, the
randomized communication complexity $R_\epsilon(F)$ is the least cost of an $\epsilon$-error
communication protocol for $F$ in this model, and $R(F) = R_{1/3}(F)$.

Another model in this paper is the number-on-forehead nondeterministic
model. As before, one considers a function $F : X_1 \times \cdots \times X_k \to \{-1,+1\}$
for some finite sets $X_1, \dots, X_k$. An input from $X_1 \times \cdots \times X_k$ is distributed
among the $k$ players as before. At the start of the protocol, $c_1$ unbiased
nondeterministic bits appear on the shared blackboard. Given the values
of those bits, the players behave deterministically, exchanging an additional
$c_2$ bits by writing them on the blackboard. A nondeterministic protocol
for $F$ must output the correct answer for at least one nondeterministic
choice of the $c_1$ bits when $F(x_1, \dots, x_k) = -1$ and for all possible choices
when $F(x_1, \dots, x_k) = +1$. The cost of a nondeterministic protocol is
defined as $c_1 + c_2$. The nondeterministic communication complexity of $F$,
denoted $N(F)$, is the least cost of a nondeterministic protocol for $F$. The
co-nondeterministic communication complexity of $F$ is the quantity $N(-F)$.
The number-on-forehead Merlin-Arthur model combines the power of
randomized and nondeterministic models. Similar to the nondeterministic
case, the protocol starts with a nondeterministic guess of $c_1$ bits, followed by
$c_2$ bits of communication. However, the communication can be randomized,
and the requirement is that the error probability be at most $\epsilon$ for at least
one nondeterministic choice when $F(x_1, \dots, x_k) = -1$ and for all possible
nondeterministic choices when $F(x_1, \dots, x_k) = +1$. The cost of a protocol
is defined as $c_1 + c_2$. The Merlin-Arthur communication complexity of $F$,
denoted $MA_\epsilon(F)$, is the least cost of an $\epsilon$-error Merlin-Arthur protocol for
$F$. We put $MA(F) = MA_{1/3}(F)$. Clearly, $MA(F) \le \min\{N(F), R(F)\}$ for
every $F$.

Analogous to computational complexity, one defines $\mathrm{BPP}_k^{cc}$, $\mathrm{NP}_k^{cc}$,
$\mathrm{coNP}_k^{cc}$, and $\mathrm{MA}_k^{cc}$ as the classes of functions $F : (\{0,1\}^n)^k \to \{-1,+1\}$
with complexity $\log^{O(1)} n$ in the randomized, nondeterministic,
co-nondeterministic, and Merlin-Arthur models, respectively.

3 Generalized Discrepancy and Pattern Matrices

A common tool for proving communication lower bounds is the discrepancy
method. Given a function $F : X \times Y \to \{-1,+1\}$ and a distribution $\mu$ on
$X \times Y$, the discrepancy of $F$ with respect to $\mu$ is defined as

\[ \mathrm{disc}_\mu(F) = \max_{S \subseteq X,\; T \subseteq Y} \Bigl| \sum_{x \in S} \sum_{y \in T} \mu(x,y)\, F(x,y) \Bigr|. \]

This definition generalizes to the multiparty case as follows. Consider a
function $F : X_1 \times \cdots \times X_k \to \{-1,+1\}$ and a distribution $\mu$ on $X_1 \times \cdots \times X_k$.
The discrepancy of $F$ with respect to $\mu$ is defined as

\[ \mathrm{disc}_\mu(F) = \max_{\chi} \Bigl| \sum_{(x_1,\dots,x_k) \in X_1 \times \cdots \times X_k} \mu(x_1,\dots,x_k)\, F(x_1,\dots,x_k)\, \chi(x_1,\dots,x_k) \Bigr|, \]

where the maximum ranges over functions $\chi : X_1 \times \cdots \times X_k \to \{0,1\}$ of the
form

\[ \chi(x_1,\dots,x_k) = \prod_{i=1}^{k} \phi_i(x_1,\dots,x_{i-1},x_{i+1},\dots,x_k) \tag{3.1} \]

for some $\phi_i : X_1 \times \cdots \times X_{i-1} \times X_{i+1} \times \cdots \times X_k \to \{0,1\}$, $i = 1, 2, \dots, k$. A
function $\chi$ of the form (3.1) is called a rectangle for $k = 2$ and a cylinder
intersection for $k \ge 3$. Note that for $k = 2$, the multiparty definition of
discrepancy agrees with the one given earlier for the two-party model. We
put

\[ \mathrm{disc}(F) = \min_{\mu} \mathrm{disc}_\mu(F). \]
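
For very small two-party examples, the definition of discrepancy can be evaluated directly. The Python sketch below is a brute-force illustration only (the function name `discrepancy` and the toy matrices are our own choices, and the running time is exponential in the number of rows). It enumerates every row set $S$ and uses the observation that, once $S$ is fixed, the best column set $T$ collects either all columns with positive weighted sum or all columns with negative weighted sum.

```python
from itertools import product

def discrepancy(F, mu):
    """Two-party disc_mu(F): the maximum over row sets S and column sets T of
    |sum_{x in S, y in T} mu[x][y] * F[x][y]|, computed by brute force."""
    rows, cols = len(F), len(F[0])
    best = 0.0
    for mask in product((0, 1), repeat=rows):        # mask encodes the row set S
        col_sums = [sum(mu[x][y] * F[x][y] for x in range(rows) if mask[x])
                    for y in range(cols)]
        positive = sum(s for s in col_sums if s > 0)   # T = columns with positive sum
        negative = -sum(s for s in col_sums if s < 0)  # T = columns with negative sum
        best = max(best, positive, negative)
    return best

# Toy inputs: the uniform distribution on a 2 x 2 matrix.
uniform = [[0.25, 0.25], [0.25, 0.25]]
constant = [[1, 1], [1, 1]]            # a constant function has discrepancy 1
inner_product = [[1, 1], [1, -1]]      # the sign matrix of inner product on one bit

disc_constant = discrepancy(constant, uniform)
disc_inner_product = discrepancy(inner_product, uniform)
```

On these inputs the constant matrix has discrepancy 1, while the one-bit inner product matrix has discrepancy 1/2 under the uniform distribution.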



Discrepancy is difficult to analyze as defined. Typically, one uses the
following estimate, derived by repeated applications of the Cauchy-Schwarz
inequality.

Theorem 3.1 ([BNS, CT, R1]). Fix $F : X_1 \times \cdots \times X_k \to \{-1,+1\}$ and a
distribution $\mu$ on $X_1 \times \cdots \times X_k$. Put $\psi(x_1,\dots,x_k) = F(x_1,\dots,x_k)\, \mu(x_1,\dots,x_k)$.
Then

\[ \Bigl( \frac{\mathrm{disc}_\mu(F)}{|X_1| \cdots |X_k|} \Bigr)^{2^{k-1}} \le \mathbf{E}_{x_1^0, x_1^1 \in X_1} \cdots \mathbf{E}_{x_{k-1}^0, x_{k-1}^1 \in X_{k-1}} \Bigl| \mathbf{E}_{x_k \in X_k} \prod_{z \in \{0,1\}^{k-1}} \psi\bigl(x_1^{z_1}, \dots, x_{k-1}^{z_{k-1}}, x_k\bigr) \Bigr|. \]

In the case of $k = 2$ parties, there are other ways to estimate the discrepancy,
including the spectral norm of a matrix (e.g., see [S2]).

For a function $F : X_1 \times \cdots \times X_k \to \{-1,+1\}$ and a distribution $\mu$ over
$X_1 \times \cdots \times X_k$, let $D_\epsilon^\mu(F)$ denote the least cost of a deterministic protocol for
$F$ whose probability of error with respect to $\mu$ is at most $\epsilon$. This quantity is
known as the $\mu$-distributional complexity of $F$. Since a randomized protocol
can be viewed as a probability distribution over deterministic protocols, we
immediately have that $R_\epsilon(F) \ge \max_\mu D_\epsilon^\mu(F)$. We are now ready to state
the discrepancy method.

Theorem 3.2 (Discrepancy method; see [KN]). For every
$F : X_1 \times \cdots \times X_k \to \{-1,+1\}$, every distribution $\mu$ on $X_1 \times \cdots \times X_k$, and $0 < \gamma \le 1$,

\[ R_{1/2 - \gamma/2}(F) \ge D^\mu_{1/2 - \gamma/2}(F) \ge \log \frac{\gamma}{\mathrm{disc}_\mu(F)}. \]

In words, a function with small discrepancy is hard to compute to any
nontrivial advantage over random guessing, let alone to high
accuracy.



3.1 Generalized Discrepancy Method 

The discrepancy method is particularly strong in that it gives communication
lower bounds not only for bounded-error protocols but also for
protocols with error vanishingly close to $\frac12$. This strength of the discrepancy
method is at once a weakness. For example, the disjointness function
$\mathrm{DISJ}(x,y) = \bigvee_{i=1}^{n} (x_i \wedge y_i)$ has a randomized protocol with error $\frac12 - \Omega(\frac1n)$
and communication $O(\log n)$. As a result, the disjointness function has high
discrepancy, and no strong lower bounds can be obtained for it via the
discrepancy method. Yet it is well-known that DISJ has communication
complexity $\Theta(n)$ in the randomized model [KS, R2] and $\Omega(\sqrt{n})$ in the quantum
model [R3] and Merlin-Arthur model [K2].
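
To make the remark about disjointness concrete, here is one simple protocol with error $\frac12 - \Omega(\frac1n)$; it is our own illustrative choice and not necessarily the protocol the cited works have in mind. A shared random index $i$ is drawn; if $x_i = y_i = 1$ the players output $-1$ ("true"), and otherwise they output $-1$ with probability $p = \frac12 - \frac1{4n}$. The Python sketch below (the names `disj` and `success_probability` and the value of $p$ are assumptions of this example) computes the exact success probability with rational arithmetic.

```python
from fractions import Fraction
from itertools import product

def disj(x, y):
    """DISJ(x, y) = OR_i (x_i AND y_i), with -1 standing for 'true'."""
    return -1 if any(a and b for a, b in zip(x, y)) else 1

def success_probability(x, y):
    """Exact probability that the sketched protocol outputs DISJ(x, y).

    Assumed protocol: draw a shared uniform index i; if x_i = y_i = 1 output -1,
    otherwise output -1 with probability p = 1/2 - 1/(4n) and +1 otherwise.
    """
    n = len(x)
    p = Fraction(1, 2) - Fraction(1, 4 * n)
    hits = sum(1 for a, b in zip(x, y) if a and b)   # indices witnessing an intersection
    prob_minus = Fraction(hits, n) + Fraction(n - hits, n) * p
    return prob_minus if disj(x, y) == -1 else 1 - prob_minus

# Worst-case success probability over all inputs for n = 3.
n = 3
worst = min(success_probability(x, y)
            for x in product((0, 1), repeat=n)
            for y in product((0, 1), repeat=n))
```

In this sketch the worst case is attained on inputs with no common index, where the success probability is exactly $\frac12 + \frac1{4n}$; announcing the two relevant bits costs only a constant number of bits once the index is shared.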

The generalized discrepancy method is an extension of the traditional
discrepancy method that avoids the difficulty just cited. This technique
was first applied by Klauck [K1] and reformulated in its current form by
Razborov [R3]. The development in [K1, R3] takes place in the quantum
model of communication. However, the same idea works in a variety of
models, as illustrated in [S2]. The version of the generalized discrepancy
method for the two-party randomized model is as follows.

Theorem 3.3 ([S2, §2.4]). Fix a function $F : X \times Y \to \{-1,+1\}$ and
$0 \le \epsilon < 1/2$. Then for all functions $H : X \times Y \to \{-1,+1\}$ and all probability
distributions $P$ on $X \times Y$,

\[ R_\epsilon(F) \ge \log \frac{\langle F, H \circ P \rangle - 2\epsilon}{\mathrm{disc}_P(H)}. \]

The usefulness of Theorem 3.3 stems from its applicability to functions that
have efficient protocols with error close to random guessing, such as
$\frac12 - \Theta(\frac1n)$ for the disjointness function. Note that one recovers Theorem 3.2,
the ordinary discrepancy method, by setting $H = F$ in Theorem 3.3.

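Proof of Theorem 3.3 (adapted from [S2], pp. 88-89). Put $c = R_\epsilon(F)$. A
public-coin protocol with cost $c$ can be thought of as a probability
distribution on deterministic protocols with cost at most $c$. In particular, there
are random variables $\chi_1, \chi_2, \dots, \chi_{2^c} : X \times Y \to \{0,1\}$, each a rectangle, as
well as random variables $\sigma_1, \sigma_2, \dots, \sigma_{2^c} \in \{-1,+1\}$, such that

\[ \Bigl\| F - \mathbf{E}\Bigl[ \sum \sigma_i \chi_i \Bigr] \Bigr\|_\infty \le 2\epsilon. \]

Therefore,

\[ \Bigl\langle F - \mathbf{E}\Bigl[ \sum \sigma_i \chi_i \Bigr],\; H \circ P \Bigr\rangle \le 2\epsilon. \]

On the other hand,

\[ \Bigl\langle F - \mathbf{E}\Bigl[ \sum \sigma_i \chi_i \Bigr],\; H \circ P \Bigr\rangle \ge \langle F, H \circ P \rangle - 2^c\, \mathrm{disc}_P(H) \]

by the definition of discrepancy. The theorem follows at once from the last
two inequalities. □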

Theorem 3.3 extends word-for-word to the multiparty model, as follows:

Theorem 3.4 ([LS, CA]). Fix a function $F : X \to \{-1,+1\}$ and
$\epsilon \in [0, 1/2)$, where $X = X_1 \times \cdots \times X_k$. Then for all functions $H : X \to \{-1,+1\}$
and all probability distributions $P$ on $X$,

\[ R_\epsilon(F) \ge \log \frac{\langle F, H \circ P \rangle - 2\epsilon}{\mathrm{disc}_P(H)}. \]

Proof. Identical to the two-party case (Theorem 3.3), with the word
"rectangles" replaced by "cylinder intersections." □

3.2 Pattern Matrix Method 

To apply the generalized discrepancy method to a given Boolean function $F$,
one needs to identify a Boolean function $H$ which is well correlated with $F$
under some distribution $P$ but has low discrepancy with respect to $P$. The
pattern matrix method [S1, S2] is a systematic technique for finding such $H$
and $P$. To simplify the exposition of our main results, we will now review
this method and sketch its proof.

Recall that the $\epsilon$-approximate degree of a function $f : \{0,1\}^n \to \mathbb{R}$,
denoted $\deg_\epsilon(f)$, is the least degree of a polynomial $p$ with $\|f - p\|_\infty \le \epsilon$. A
starting point in the pattern matrix method is the following dual formulation
of the approximate degree.
Fact 3.5. Fix $\epsilon \ge 0$. Let $f : \{0,1\}^n \to \mathbb{R}$ be given with $d = \deg_\epsilon(f) \ge 1$.
Then there is a function $\psi : \{0,1\}^n \to \mathbb{R}$ such that:

\[ \hat{\psi}(S) = 0 \quad \text{for } |S| < d, \]
\[ \sum_{z \in \{0,1\}^n} |\psi(z)| = 1, \]
\[ \sum_{z \in \{0,1\}^n} \psi(z) f(z) > \epsilon. \]

See [S2] for a proof of this fact using linear programming duality. The crux
of the method is the following theorem.

Theorem 3.6 ([S1]). Fix a function $h : \{0,1\}^n \to \{-1,+1\}$ and a probability
distribution $\mu$ on $\{0,1\}^n$ such that

\[ \widehat{h\mu}(S) = 0 \quad \text{for } |S| < d. \]

Let $N$ be a given integer. Define

\[ H = [h(x|_V)]_{x,V}, \qquad P = 2^{-N+n} \binom{N}{n}^{-1} [\mu(x|_V)]_{x,V}, \]

where the rows are indexed by $x \in \{0,1\}^N$ and columns by $V \in \binom{[N]}{n}$. Then

\[ \mathrm{disc}_P(H) \le \Bigl( \frac{4en^2}{Nd} \Bigr)^{d/2}. \]

At last, we are ready to state the pattern matrix method.

Theorem 3.7 ([S2]). Let $f : \{0,1\}^n \to \{-1,+1\}$ be a given function,
$d = \deg_{1/3}(f)$. Let $N$ be a given integer. Define $F = [f(x|_V)]_{x,V}$, where the
rows are indexed by $x \in \{0,1\}^N$ and columns by $V \in \binom{[N]}{n}$. If $N \ge 16en^2/d$,
then

\[ R(F) = \Omega\Bigl( d \log \Bigl\{ \frac{Nd}{4en^2} \Bigr\} \Bigr). \]
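
As a concrete illustration of the notation $[f(x|_V)]_{x,V}$, the following Python sketch builds such a matrix explicitly for tiny parameters. The helper names `restrict` and `pattern_matrix` are ours, indices are 0-based, and the construction is meant only to show the indexing of rows and columns, not to be efficient.

```python
from itertools import combinations, product

def restrict(x, V):
    """Return x|_V: the bits of x at the sorted, 0-indexed positions in V."""
    return tuple(x[i] for i in V)

def pattern_matrix(f, N, n):
    """Build [f(x|_V)] with rows x in {0,1}^N and columns V ranging over
    the size-n subsets of range(N). Also return the row and column labels."""
    rows = list(product((0, 1), repeat=N))
    cols = list(combinations(range(N), n))
    matrix = [[f(restrict(x, V)) for V in cols] for x in rows]
    return matrix, rows, cols

# Example: f = OR on n = 2 bits (with -1 standing for 'true'), and N = 4.
f_or = lambda z: -1 if any(z) else 1
F, rows, cols = pattern_matrix(f_or, N=4, n=2)
```

Here the matrix has $2^4 = 16$ rows and $\binom{4}{2} = 6$ columns; the row of the all-zero string is constant $+1$, and the row of a string with a single 1 has a $-1$ in exactly the three columns whose set $V$ contains that position.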
Proof (adapted from [S2]). Let $\epsilon = 1/10$. By Fact 3.5, there exists a function
$h : \{0,1\}^n \to \{-1,+1\}$ and a probability distribution $\mu$ on $\{0,1\}^n$ such
that

\[ \widehat{h\mu}(S) = 0, \qquad |S| < d, \tag{3.2} \]

and

\[ \sum_{z \in \{0,1\}^n} f(z)\, \mu(z)\, h(z) > \frac13. \tag{3.3} \]

Letting $H = [h(x|_V)]_{x,V}$ and $P = 2^{-N+n} \binom{N}{n}^{-1} [\mu(x|_V)]_{x,V}$, we obtain
from (3.2) and Theorem 3.6 that

\[ \mathrm{disc}_P(H) \le \Bigl( \frac{4en^2}{Nd} \Bigr)^{d/2}. \tag{3.4} \]

At the same time, one sees from (3.3) that

\[ \langle F, H \circ P \rangle > \frac13. \tag{3.5} \]

The theorem now follows from (3.4) and (3.5) in view of the generalized
discrepancy method, Theorem 3.3. □

Remark. Presented above is a weaker, combinatorial version of the pattern
matrix method. The communication lower bounds in Theorems 3.6 and 3.7
were improved to optimal in [S2] using matrix-analytic techniques. Unlike
the combinatorial argument above, however, the matrix-analytic proof is not
known to extend to the multiparty model and is not used in the follow-up
multiparty papers [LS, CA, DP, DPV, BH] or our work.

An alternate technique based on Fact 3.5 is the block-composition
method [SZ], developed independently of the pattern matrix method.
See [S3, §5.3] for a comparative discussion.

4 A New Criterion for Nondeterministic and Merlin-Arthur Complexity

In this section, we derive a new criterion for high communication complexity 
in the nondeterministic and Merlin-Arthur models. This criterion, inspired 
by the generalized discrepancy method, will allow us to obtain our main 
result. 
Theorem 4.1. Let $F : X \to \{-1,+1\}$ be given, where $X = X_1 \times \cdots \times X_k$.
Fix a function $H : X \to \{-1,+1\}$ and a probability distribution $P$ on $X$.
Put

\[ \alpha = P\bigl(F^{-1}(-1) \cap H^{-1}(-1)\bigr), \]
\[ \beta = P\bigl(F^{-1}(-1) \cap H^{-1}(+1)\bigr), \]
\[ Q = \log \frac{\alpha}{\beta + \mathrm{disc}_P(H)}. \]

Then

\[ N(F) \ge Q \tag{4.1} \]

and

\[ MA(F) \ge \min\Bigl\{ \Omega\bigl(\sqrt{Q}\bigr),\; \Omega\Bigl( \frac{Q}{\log\{2/\alpha\}} \Bigr) \Bigr\}. \tag{4.2} \]



Proof. Put $c = N(F)$. Then there is a cover of $F^{-1}(-1)$ by $2^c$ cylinder
intersections, each contained in $F^{-1}(-1)$. Fix one such cover,
$\chi_1, \chi_2, \dots, \chi_{2^c} : X \to \{0,1\}$. By the definition of discrepancy,

\[ \Bigl\langle \sum \chi_i,\; -H \circ P \Bigr\rangle \le 2^c\, \mathrm{disc}_P(H). \]

On the other hand, $\sum \chi_i$ ranges between 1 and $2^c$ on $F^{-1}(-1)$ and vanishes
on $F^{-1}(+1)$. Therefore,

\[ \Bigl\langle \sum \chi_i,\; -H \circ P \Bigr\rangle \ge \alpha - 2^c \beta. \]

These two inequalities force (4.1).

We now turn to the Merlin-Arthur model. Let $c = MA(F)$ and
$\delta = \alpha 2^{-c-1}$. The first step is to improve the error probability of the
Merlin-Arthur protocol by repetition from 1/3 to $\delta$. Specifically, following
Klauck [K2], we observe that there exist randomized protocols
$F_1, \dots, F_{2^c} : X \to \{0,1\}$, each a random variable of the coin tosses and each having
communication cost $c' = O(c \log\{1/\delta\})$, such that the sum

\[ \sum_{i=1}^{2^c} \mathbf{E}[F_i] \]

ranges in $[1 - \delta, 2^c]$ on $F^{-1}(-1)$ and in $[0, \delta 2^c]$ on $F^{-1}(+1)$. As a result,

\[ \Bigl\langle \sum_{i=1}^{2^c} \mathbf{E}[F_i],\; -H \circ P \Bigr\rangle \ge \alpha(1-\delta) - \beta 2^c - (1 - \alpha - \beta)\, \delta 2^c. \tag{4.3} \]

At the same time,

\[ \Bigl\langle \sum_{i=1}^{2^c} \mathbf{E}[F_i],\; -H \circ P \Bigr\rangle \le \sum_{i=1}^{2^c} 2^{c'}\, \mathrm{disc}_P(H) = 2^{c+c'}\, \mathrm{disc}_P(H). \tag{4.4} \]

The bounds in (4.3) and (4.4) force (4.2). □
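
For the reader's convenience, the arithmetic behind the word "force" in both parts of the proof can be spelled out as follows. This is our own filling-in of the omitted steps, under the additional assumption $c \ge 1$ in the Merlin-Arthur case; the constants are not optimized.

```latex
% Nondeterministic bound (4.1): combine the upper and lower bounds on
% <sum_i chi_i, -H o P> from the first part of the proof.
\alpha - 2^{c}\beta \;\le\; 2^{c}\,\mathrm{disc}_P(H)
\quad\Longrightarrow\quad
2^{c} \;\ge\; \frac{\alpha}{\beta + \mathrm{disc}_P(H)}
\quad\Longrightarrow\quad
N(F) = c \;\ge\; Q.

% Merlin-Arthur bound (4.2): with \delta = \alpha 2^{-c-1} we have
% \delta 2^{c} = \alpha/2 and, for c \ge 1, \delta \le 1/4.
% Using 1 - \alpha - \beta \le 1, the bounds (4.3) and (4.4) give
\frac{\alpha}{4}
\;\le\; \alpha(1-\delta) - \frac{\alpha}{2}
\;\le\; \beta 2^{c} + 2^{c+c'}\,\mathrm{disc}_P(H)
\;\le\; 2^{c+c'}\bigl(\beta + \mathrm{disc}_P(H)\bigr),
% so that c + c' \ge Q - 2. Since
% c' = O(c \log\{1/\delta\}) = O(c^{2} + c \log\{2/\alpha\}),
% it follows that
c \;\ge\; \min\Bigl\{ \Omega\bigl(\sqrt{Q}\bigr),\;
                     \Omega\Bigl(\frac{Q}{\log\{2/\alpha\}}\Bigr) \Bigr\}.
```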

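Since sign tensors $H$ and $-H$ have the same discrepancy under any given
distribution, we have the following alternate form of Theorem 4.1.

Corollary 4.2. Let $F : X \to \{-1,+1\}$ be given, where $X = X_1 \times \cdots \times X_k$.
Fix a function $H : X \to \{-1,+1\}$ and a probability distribution $P$ on $X$.
Put

\[ \alpha = P\bigl(F^{-1}(+1) \cap H^{-1}(+1)\bigr), \]
\[ \beta = P\bigl(F^{-1}(+1) \cap H^{-1}(-1)\bigr), \]
\[ Q = \log \frac{\alpha}{\beta + \mathrm{disc}_P(H)}. \]

Then

\[ N(-F) \ge Q \]

and

\[ MA(-F) \ge \min\Bigl\{ \Omega\bigl(\sqrt{Q}\bigr),\; \Omega\Bigl( \frac{Q}{\log\{2/\alpha\}} \Bigr) \Bigr\}. \]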



At first glance, it is unclear how the nondeterministic bound of
Theorem 4.1 and its counterpart Corollary 4.2 relate to the generalized
discrepancy method. We now pause to make this relationship quite explicit. Recall
that nondeterminism is a kind of randomized computation, viz., a
nondeterministic protocol with cost $c$ for a function $F$ is a kind of cost-$c$ randomized
protocol with error probability at most $\epsilon = \frac12 - 2^{-c}$ on $F^{-1}(-1)$ and error
probability $\epsilon = 0$ elsewhere. This is the setting of Theorem 4.1. The
generalized discrepancy method, on the other hand, has a single error parameter
$\epsilon$ for all inputs. To best convey this distinction between the two methods,
we formulate a more general criterion yet, which allows for different errors
on each input.
Theorem 4.3. Let $F : X \to \{-1,+1\}$ be given, where $X = X_1 \times \cdots \times X_k$.
Let $c$ be the least cost of a public-coin protocol for $F$ with error probability
$E(x)$ on $x \in X$, for some $E : X \to [0, 1/2]$. Then for all functions
$H : X \to \{-1,+1\}$ and all probability distributions $P$ on $X$,

\[ 2^c \ge \frac{\langle F, H \circ P \rangle - 2 \langle P, E \rangle}{\mathrm{disc}_P(H)}. \]

Proof. A public-coin protocol with cost $c$ is a probability distribution on
deterministic protocols with cost at most $c$. Then by hypothesis, there are
random variables $\chi_1, \chi_2, \dots, \chi_{2^c} : X \to \{0,1\}$, each a cylinder intersection,
and random variables $\sigma_1, \sigma_2, \dots, \sigma_{2^c} \in \{-1,+1\}$, such that

\[ \Bigl| F(x) - \mathbf{E}\Bigl[ \sum \sigma_i \chi_i(x) \Bigr] \Bigr| \le 2E(x) \qquad \text{for } x \in X. \]

Therefore,

\[ \Bigl\langle F - \mathbf{E}\Bigl[ \sum \sigma_i \chi_i \Bigr],\; H \circ P \Bigr\rangle \le 2 \langle P, E \rangle. \]

On the other hand,

\[ \Bigl\langle F - \mathbf{E}\Bigl[ \sum \sigma_i \chi_i \Bigr],\; H \circ P \Bigr\rangle \ge \langle F, H \circ P \rangle - 2^c\, \mathrm{disc}_P(H) \]

by the definition of discrepancy. The theorem follows at once from the last
two inequalities. □



5 Main Result 

We now prove the claimed separations of nondeterministic, co- 
nondeterministic, and Merlin-Arthur communication complexity. It will be 
easier to first obtain these separations by a probabilistic argument and only 
then sketch an explicit construction. 

We start by deriving a suitable analytic property of the OR function. 

Theorem 5.1. There is a function $\psi : \{0,1\}^m \to \mathbb{R}$ such that:

\[ \sum_{z \in \{0,1\}^m} |\psi(z)| = 1, \tag{5.1} \]
\[ \hat{\psi}(S) = 0 \quad \text{for } |S| \le \Theta(\sqrt{m}), \tag{5.2} \]
\[ \psi(0) > \frac16. \tag{5.3} \]

Proof. Let $f : \{0,1\}^m \to \{-1,+1\}$ be given by $f(z) = 1 \Leftrightarrow z = 0$. It is
well-known [NS, P] that $\deg_{1/3}(f) \ge \Omega(\sqrt{m})$. By Fact 3.5, there is a function
$\psi : \{0,1\}^m \to \mathbb{R}$ that obeys (5.1), (5.2), and additionally satisfies

\[ \sum_{z \in \{0,1\}^m} \psi(z) f(z) > \frac13. \]

Finally,

\[ 2\psi(0) = \sum_{z \in \{0,1\}^m} \psi(z) \{ f(z) + 1 \} = \sum_{z \in \{0,1\}^m} \psi(z) f(z) > \frac13, \]

where the second equality follows from $\hat{\psi}(\emptyset) = 0$. □



For the remainder of this section, it will be convenient to establish some
additional notation following David and Pitassi [DP]. Fix integers $n, m$ with
$n > m$. Let $\psi : \{0,1\}^m \to \mathbb{R}$ be a given function with $\sum_{z \in \{0,1\}^m} |\psi(z)| = 1$.
Let $d$ denote the least order of a nonzero Fourier coefficient of $\psi$. Fix a
Boolean function $h : \{0,1\}^m \to \{-1,+1\}$ and the distribution $\mu$ on $\{0,1\}^m$
such that $\psi(z) \equiv h(z)\mu(z)$. For a mapping $\alpha : (\{0,1\}^n)^k \to \binom{[n]}{m}$, define
a $(k+1)$-party communication problem $H_\alpha : (\{0,1\}^n)^{k+1} \to \{-1,+1\}$ by
$H_\alpha(x, y_1, \dots, y_k) = h(x|_{\alpha(y_1,\dots,y_k)})$. Define a distribution $P_\alpha$ on $(\{0,1\}^n)^{k+1}$
by $P_\alpha(x, y_1, \dots, y_k) = 2^{-(k+1)n+m} \mu(x|_{\alpha(y_1,\dots,y_k)})$. The following theorem
combines the pattern matrix method with a probabilistic argument.

Theorem 5.2 ([DP]). Assume that $n \ge 16em^2 2^k$. Then for a uniformly
random choice of $\alpha : (\{0,1\}^n)^k \to \binom{[n]}{m}$,

\[ \mathbf{E}_\alpha \Bigl[ \mathrm{disc}_{P_\alpha}(H_\alpha)^{2^k} \Bigr] \le 2^{-n/2} + 2^{-d2^k+1}. \]



For completeness, we include a detailed proof of this result. 

Proof (reproduced from the survey article [S3], pp. 88-89). By
Theorem 3.1,

\[ \mathrm{disc}_{P_\alpha}(H_\alpha)^{2^k} \le 2^{m2^k}\, \mathbf{E}_Y |\Gamma(Y)|, \tag{5.4} \]

where we put $Y = (y_1^0, y_1^1, \dots, y_k^0, y_k^1) \in (\{0,1\}^n)^{2k}$ and

\[ \Gamma(Y) = \mathbf{E}_x \Bigl[ \prod_{z \in \{0,1\}^k} \psi\bigl( x|_{\alpha(y_1^{z_1}, \dots, y_k^{z_k})} \bigr) \Bigr]. \]

For a fixed choice of $\alpha$ and $Y$, we will use the shorthand $S_z = \alpha(y_1^{z_1}, \dots, y_k^{z_k})$.
To analyze $\Gamma(Y)$, one proves two key claims analogous to those in the
two-party Theorem 3.6 (see [S1], [S3] for more detail).



Claim 5.3. Assume that $\bigl| \bigcup_{z \in \{0,1\}^k} S_z \bigr| > m2^k - d2^{k-1}$. Then $\Gamma(Y) = 0$.

Proof. If $|\bigcup S_z| > m2^k - d2^{k-1}$, then some $S_z$ must feature more than $m - d$
elements that do not occur in $\bigcup_{u \ne z} S_u$. But this forces $\Gamma(Y) = 0$ since the
Fourier transform of $\psi$ is supported on characters of order $d$ and higher. □



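Claim 5.4. For every $Y$, $|\Gamma(Y)| \le 2^{-|\bigcup S_z|}$.

Proof. Immediate from Proposition 2.1. □

In view of (5.4) and Claims 5.3 and 5.4, we have

\[ \mathbf{E}_\alpha \Bigl[ \mathrm{disc}_{P_\alpha}(H_\alpha)^{2^k} \Bigr] \le \sum_{i = d2^{k-1}}^{m2^k - m} 2^i\, \mathbf{P}_{Y,\alpha} \Bigl[ \Bigl| \bigcup S_z \Bigr| = m2^k - i \Bigr]. \]

It remains to bound the probabilities in the last expression. With probability
at least $1 - k2^{-n}$ over the choice of $Y$, we have $y_i^0 \ne y_i^1$ for each $i = 1, 2, \dots, k$.
Conditioning on this event, the fact that $\alpha$ is chosen uniformly at random
means that the $2^k$ sets $S_z$ are distributed independently and uniformly over
$\binom{[n]}{m}$. A calculation now reveals that

\[ \mathbf{P}_{Y,\alpha} \Bigl[ \Bigl| \bigcup S_z \Bigr| = m2^k - i \Bigr] \le k2^{-n} + \binom{m2^k}{i} \Bigl( \frac{m2^k}{n} \Bigr)^{i} \le k2^{-n} + 8^{-i}. \]

□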
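
The final inequality in the calculation above, $\binom{m2^k}{i} (m2^k/n)^i \le 8^{-i}$ for $i \ge d2^{k-1}$ under the hypothesis $n \ge 16em^2 2^k$, can be sanity-checked numerically for small parameters. The Python sketch below is only such a check, not a proof; the function names `tail_term` and `check_bound` and the sample parameters are our own, and $n$ is taken to be the smallest integer allowed by the hypothesis.

```python
from fractions import Fraction
from math import comb, e, ceil

def tail_term(m, k, n, i):
    """Return binom(m * 2^k, i) * (m * 2^k / n)^i exactly, as a Fraction."""
    M = m * 2 ** k
    return comb(M, i) * Fraction(M, n) ** i

def check_bound(m, k, d):
    """Check tail_term <= 8^{-i} for every i from d * 2^(k-1) to m * 2^k - m,
    with n the smallest integer satisfying n >= 16 * e * m^2 * 2^k.
    Assumes k >= 1 and d >= 1."""
    n = ceil(16 * e * m ** 2 * 2 ** k)
    lo, hi = d * 2 ** (k - 1), m * 2 ** k - m
    return all(tail_term(m, k, n, i) <= Fraction(1, 8 ** i)
               for i in range(lo, hi + 1))

# A few small parameter settings (illustrative choices).
results = [check_bound(4, 2, 2), check_bound(3, 1, 1), check_bound(6, 3, 2)]
```

The check uses exact rational arithmetic for the comparison, so the only floating-point step is the rounding of $16em^2 2^k$ up to an integer.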



We are ready to prove our main result. It may be helpful to contrast the
proof to follow with the proof of the pattern matrix method (Theorem 3.7).

Theorem 5.5. Let $k \le (1-\epsilon)\log n$, where $\epsilon > 0$ is any given constant.
Then there exists a function $F_\alpha : (\{0,1\}^n)^{k+1} \to \{-1,+1\}$ such that:

\[ N(F_\alpha) = O(\log n) \tag{5.5} \]

and

\[ MA(-F_\alpha) = n^{\Omega(1)}. \tag{5.6} \]

In particular, $\mathrm{coNP}_k^{cc} \not\subseteq \mathrm{MA}_k^{cc}$ and $\mathrm{NP}_k^{cc} \ne \mathrm{coNP}_k^{cc}$.
Proof. Let $m = \lfloor n^\delta \rfloor$ for a sufficiently small constant $\delta = \delta(\epsilon) > 0$. As
usual, define $\mathrm{OR}_m : \{0,1\}^m \to \{-1,+1\}$ by $\mathrm{OR}_m(z) = 1 \Leftrightarrow z = 0$. Let
$\psi : \{0,1\}^m \to \mathbb{R}$ be as guaranteed by Theorem 5.1. For a mapping
$\alpha : (\{0,1\}^n)^k \to \binom{[n]}{m}$, let $H_\alpha$ and $P_\alpha$ be defined in terms of $\psi$ as described
earlier in this section. Then Theorem 5.2 shows the existence of $\alpha$ such that

\[ \mathrm{disc}_{P_\alpha}(H_\alpha) \le 2^{-\Omega(\sqrt{m})}. \tag{5.7} \]

Define $F_\alpha : (\{0,1\}^n)^{k+1} \to \{-1,+1\}$ by $F_\alpha(x, y_1, \dots, y_k) = \mathrm{OR}_m(x|_{\alpha(y_1,\dots,y_k)})$.
It is immediate from the properties of $\psi$ that

\[ P_\alpha\bigl( F_\alpha^{-1}(+1) \cap H_\alpha^{-1}(+1) \bigr) > \frac16, \tag{5.8} \]
\[ P_\alpha\bigl( F_\alpha^{-1}(+1) \cap H_\alpha^{-1}(-1) \bigr) = 0. \tag{5.9} \]

The sought lower bound in (5.6) now follows from (5.7)-(5.9) and
Corollary 4.2.

On the other hand, as observed in [DPj . the function F a has an effi- 
cient nondeterministic protocol. Namely, player 1 (who knows yi, ■ ■ ■ ,Vk) 
nondeterministically selects an element i E a>(yi, . . . ,yj.) and writes i on the 
shared blackboard. Player 2 (who knows x) then announces Xi as the output 
of the protocol. This yields the desired upper bound in (15.5P . □ 
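The nondeterministic protocol in the proof above can be simulated directly. The following is a minimal sketch, not part of the original argument: `toy_alpha` is a hypothetical stand-in for the mapping α (any fixed mapping into m-element subsets of the n coordinates would do), the names `protocol_accepts` and `communication_cost` are illustrative, coordinates are 0-indexed, and acceptance is modeled as a Boolean rather than in the ±1 encoding.

```python
import random

def toy_alpha(ys, n, m):
    """Hypothetical stand-in for alpha: maps (y_1, ..., y_k) to m of the n coordinates."""
    rng = random.Random(repr(ys))  # deterministic in ys, so alpha is a fixed mapping
    return sorted(rng.sample(range(n), m))

def protocol_accepts(x, ys, n, m):
    """Player 1 guesses i in alpha(ys); player 2 announces x[i].
    A nondeterministic protocol accepts iff some guess leads to acceptance."""
    return any(x[i] == 1 for i in toy_alpha(ys, n, m))

def communication_cost(n):
    """Bits on the blackboard: an index into n coordinates plus one answer bit."""
    return (n - 1).bit_length() + 1

if __name__ == "__main__":
    n, m = 8, 3
    ys = ((0, 1, 1, 0, 1, 0, 0, 1), (1, 1, 0, 0, 0, 1, 0, 1))
    x = (0, 0, 1, 0, 0, 0, 0, 0)
    print(toy_alpha(ys, n, m), protocol_accepts(x, ys, n, m), communication_cost(n))
```

The cost is logarithmic in n regardless of k and m, which is the content of the upper bound (5.5).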

As promised, we will now sketch an explicit construction of the function whose existence has just been proven. For this, it suffices to invoke previous work by David, Pitassi, and Viola [DPV], who derandomized the choice of $\alpha$ in Theorem 5.2. More precisely, instead of working with a family $\{H_\alpha\}$ of functions, each given by $H_\alpha(x, y_1, \dots, y_k) = h(x|_{\alpha(y_1,\dots,y_k)})$, the authors of [DPV] posited a single function $H(\alpha, x, y_1, \dots, y_k) = h(x|_{\alpha(y_1,\dots,y_k)})$, where the new argument $\alpha$ is known to all players and ranges over a small, explicitly given subset $A$ of all mappings $(\{0,1\}^n)^k \to \binom{[n]}{m}$. By choosing $A$ to be pseudorandom, the authors of [DPV] forced the same qualitative conclusion in Theorem 5.2. This development carries over unchanged to our setting, and we obtain our main result.

Theorem 1.1 (restated). Let $k \le (1 - \epsilon)\log_2 n$, where $\epsilon > 0$ is any given constant. Then there is an (explicitly given) function $F : (\{0,1\}^n)^k \to \{-1,+1\}$ with

$$N(-F) = O(\log n)$$

and

$$MA(F) = n^{\Omega(1)}.$$

In particular, $\mathsf{coNP}^{cc}_k \not\subseteq \mathsf{MA}^{cc}_k$ and $\mathsf{NP}^{cc}_k \ne \mathsf{coNP}^{cc}_k$.

Proof. Identical to Theorem 5.5, with the described derandomization of $\alpha$. □

6 On Disjointness and Constant-Depth Circuits 

In this final section, we revisit recent multiparty analyses of the disjointness
function and other constant-depth circuits [C, LS, CA, BH]. We will see
that the program of the previous sections applies essentially unchanged to
these other functions.

We start with some notation. Fix a function $\phi : \{0,1\}^m \to \mathbb{R}$ and an integer $N$ with $m \mid N$. Define the $(k, N, m, \phi)$-pattern tensor as the $k$-argument function $A : \{0,1\}^{m(N/m)^{k-1}} \times [N/m]^m \times \cdots \times [N/m]^m \to \mathbb{R}$ given by $A(x, V_1, \dots, V_{k-1}) = \phi(x|_{V_1,\dots,V_{k-1}})$, where

$$x|_{V_1,\dots,V_{k-1}} = \left(x_{1,V_1[1],\dots,V_{k-1}[1]}, \dots, x_{m,V_1[m],\dots,V_{k-1}[m]}\right) \in \{0,1\}^m$$

and $V_j[i]$ denotes the $i$th element of the $m$-dimensional vector $V_j$. (Note that we index the string $x$ by viewing it as a $k$-dimensional array of $m \times (N/m) \times \cdots \times (N/m) = m(N/m)^{k-1}$ bits.) This definition extends pattern matrices [S1, S2] to higher dimensions. The two-party Theorem 3.6 has been adapted as follows to $k \ge 3$ players.
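To make the indexing in the pattern tensor definition concrete, here is a small sketch. It assumes a particular storage layout that the text does not prescribe: x is a flat list holding the k-dimensional array in row-major order, and all indices are 0-based rather than 1-based. The function names `restrict` and `pattern_tensor_entry` and the parameter `blocks` (standing for N/m) are illustrative.

```python
def restrict(x, vs, m, blocks):
    """Return the m selected entries of x for selector vectors vs = (V_1, ..., V_{k-1}).

    x is a flat row-major list of m * blocks**(k-1) entries; entry i of the
    result is x[i, V_1[i], ..., V_{k-1}[i]] in array notation."""
    selected = []
    for i in range(m):
        idx = i
        for v in vs:
            idx = idx * blocks + v[i]  # row-major offset into the next dimension
        selected.append(x[idx])
    return selected

def pattern_tensor_entry(phi, x, vs, m, blocks):
    """Entry A(x, V_1, ..., V_{k-1}) = phi(x restricted by V_1, ..., V_{k-1})."""
    return phi(restrict(x, vs, m, blocks))

def or_pm(z):
    """OR in the +1/-1 convention used in the text: +1 iff z is all zeros."""
    return 1 if not any(z) else -1

if __name__ == "__main__":
    m, blocks = 2, 2                      # k = 3, so x has 2 * 2 * 2 = 8 bits
    x = [0, 0, 1, 0, 0, 0, 0, 0]
    vs = ([1, 0], [0, 1])
    print(restrict(x, vs, m, blocks), pattern_tensor_entry(or_pm, x, vs, m, blocks))
```

Each selector vector picks one position per row of the array, so every tensor entry depends on exactly m bits of x.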

Theorem 6.1 ([C, LS, CA]). Fix a function $h : \{0,1\}^m \to \{-1,+1\}$ and a probability distribution $\mu$ on $\{0,1\}^m$ such that

$$\widehat{h\mu}(S) = 0, \qquad |S| < d.$$

Let $N$ be a given integer, $m \mid N$. Let $H$ be the $(k, N, m, h)$-pattern tensor. Let $P$ be the $(k, N, m, 2^{-m(N/m)^{k-1}+m}(N/m)^{-m(k-1)}\mu)$-pattern tensor. If $N \ge 4em^2(k-1)2^{2^{k-1}}/d$, then

$$\operatorname{disc}_P(H) \le 2^{-d/2^{k-1}}.$$

A proof of this exact formulation is available in the survey article [S3],
pp. 85-86. We are now prepared to apply our techniques to the disjointness 
function. 
Theorem 6.2. Let $N$ be a given integer, $m \mid N$. Let $F$ be the $(k, N, m, \mathrm{OR}_m)$-pattern tensor. If $N \ge 4em^2(k-1)2^{2^{k-1}}/d$, then

$$N(-F) \ge \Omega\left(\frac{\sqrt{m}}{2^k}\right), \qquad MA(-F) \ge \Omega\left(\frac{m^{1/4}}{2^{k/2}}\right).$$

Proof. Let $\psi : \{0,1\}^m \to \mathbb{R}$ be as guaranteed by Theorem 5.1. Fix a function $h : \{0,1\}^m \to \{-1,+1\}$ and a distribution $\mu$ on $\{0,1\}^m$ such that $\psi(z) = h(z)\mu(z)$. Let $H$ be the $(k, N, m, h)$-pattern tensor. Let $P$ be the $(k, N, m, 2^{-m(N/m)^{k-1}+m}(N/m)^{-m(k-1)}\mu)$-pattern tensor, which is a probability distribution. Then by Theorem 6.1,

$$\operatorname{disc}_P(H) \le 2^{-\Omega(\sqrt{m}/2^k)}. \tag{6.1}$$

On the other hand, it is clear from the properties of $\psi$ that

$$P\left(F^{-1}(+1) \cap H^{-1}(+1)\right) > \frac{1}{6}, \tag{6.2}$$

$$P\left(F^{-1}(+1) \cap H^{-1}(-1)\right) = 0. \tag{6.3}$$

In view of (6.1)-(6.3) and Corollary 4.2, the proof is complete. □

The function $F$ in Theorem 6.2 is a subfunction of the multiparty disjointness function $\mathrm{disj} : (\{0,1\}^n)^k \to \{-1,+1\}$, where $n = m(N/m)^{k-1}$ and

$$\mathrm{disj}(x_1, \dots, x_k) = \bigvee_{j=1}^{n} \bigwedge_{i=1}^{k} x_{i,j}.$$

Recall that disjointness has trivial nondeterministic complexity, $O(\log n)$. In particular, Theorem 6.2 shows that the disjointness function separates $\mathsf{NP}^{cc}_k$ from $\mathsf{coNP}^{cc}_k$ and witnesses that $\mathsf{coNP}^{cc}_k \not\subseteq \mathsf{MA}^{cc}_k$ for up to $k = \Theta(\log\log n)$ players. Our technique similarly applies to the follow-up work on disjointness by Beame and Huynh-Ngoc [BH], whence we obtain the stronger consequence that the disjointness function separates $\mathsf{NP}^{cc}_k$ from $\mathsf{coNP}^{cc}_k$ and witnesses that $\mathsf{coNP}^{cc}_k \not\subseteq \mathsf{MA}^{cc}_k$ for up to $k = \Theta(\log^{1/3} n)$ players.
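The limit of roughly log log n players for the pattern tensor approach can be seen numerically. The sketch below is an illustration under stated assumptions: it takes the hypothesis of Theorem 6.2 in the form N ≥ 4em²(k−1)2^{2^{k−1}}/d, uses the smallest such N while ignoring the rounding needed for m | N, and treats m and d as free parameters (the sample values m = 64, d = 8 are arbitrary). The function name `log2_input_length` is illustrative. It reports log₂ of the input length n = m(N/m)^{k−1}.

```python
import math

def log2_input_length(k, m, d):
    """log2 of n = m * (N/m)**(k-1) for the smallest N allowed by
    N >= 4e * m^2 * (k-1) * 2**(2**(k-1)) / d (divisibility rounding ignored)."""
    # Work with logarithms throughout so that large k does not overflow floats.
    log2_N = math.log2(4 * math.e * m * m * (k - 1) / d) + 2 ** (k - 1)
    return math.log2(m) + (k - 1) * (log2_N - math.log2(m))

if __name__ == "__main__":
    for k in range(2, 9):
        print(k, round(log2_input_length(k, m=64, d=8)))
```

Since log₂ n is at least (k−1)2^{k−1} under these assumptions, the input length grows doubly exponentially in k, which is why the bound applies only for k up to order log log n.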

We conclude this section with a remark on constant-depth circuits. Let $\epsilon$ be a sufficiently small absolute constant, $0 < \epsilon < 1$. For each $k = 2, 3, \dots, \epsilon\log n$, the authors of [BH] construct a constant-depth circuit $F : (\{0,1\}^n)^k \to \{-1,+1\}$ with $N(F) = \log^{O(1)} n$ and $R(F) = n^{\Omega(1)}$. A glance at the proof in [BH] reveals, once again, that the program of our paper is readily applicable to $F$, with the consequence that $MA(-F) = n^{\Omega(1)}$. In particular, our work shows that $\mathsf{NP}^{cc}_k \ne \mathsf{coNP}^{cc}_k$ and $\mathsf{coNP}^{cc}_k \not\subseteq \mathsf{MA}^{cc}_k$ for up to $k = \epsilon\log n$ players, as witnessed by a constant-depth circuit.